Search for: All records

Creators/Authors contains: "Alali, Abrar"


  1. Recent statistics reveal an alarming increase in accidents involving pedestrians (especially children) crossing the street. A common philosophy of existing pedestrian detection approaches is that this task should be undertaken by the moving cars themselves. In sharp departure from this philosophy, we propose to enlist the help of cars parked along the sidewalk to detect and protect crossing pedestrians. In support of this goal, we propose ADOPT: a system for Alerting Drivers to Occluded Pedestrian Traffic. ADOPT lays the theoretical foundations of a system that uses parked cars to: (1) detect the presence of a group of crossing pedestrians, called a crossing cohort; (2) predict the time the last member of the cohort takes to clear the street; (3) send alert messages to those approaching cars that may reach the crossing area while pedestrians are still in the street; and (4) show how approaching cars can adjust their speed, given several simultaneous crossing locations. Importantly, in ADOPT all communications occur over very short distances and at very low power. Our extensive simulations using SUMO-generated pedestrian and car traffic have shown the effectiveness of ADOPT in detecting and protecting crossing pedestrians.
  2. Tracking subjects in videos is one of the most widely used functions in camera-based IoT applications such as security surveillance, smart city traffic safety enhancement, and vehicle-to-pedestrian communication. In the computer vision domain, tracking is usually achieved by first detecting subjects, then associating detected bounding boxes across video frames. Typically, frames are transmitted to a remote site for processing, incurring high latency and network costs. To address this, we propose ViFiT, a transformer-based model that reconstructs vision bounding box trajectories from phone data (IMU and Fine Time Measurements). It leverages a transformer's ability to better model long-term time series data. ViFiT is evaluated on the Vi-Fi Dataset, a large-scale multimodal dataset covering 5 diverse real-world scenes, including indoor and outdoor environments. Results demonstrate that ViFiT outperforms X-Translator, the state-of-the-art LSTM encoder-decoder approach for cross-modal reconstruction, and achieves a frame reduction rate as high as 97.76% with IMU and Wi-Fi data.
  3. In this paper, we present ViTag to associate user identities across multimodal data, particularly those obtained from cameras and smartphones. ViTag associates a sequence of vision-tracker-generated bounding boxes with Inertial Measurement Unit (IMU) data and Wi-Fi Fine Time Measurements (FTM) from smartphones. We formulate the problem as association by sequence-to-sequence (seq2seq) translation. In this two-step process, our system first performs cross-modal translation using a multimodal LSTM encoder-decoder network (X-Translator) that translates one modality to another, e.g., reconstructing IMU and FTM readings purely from camera bounding boxes. Second, an association module finds identity matches between the camera and phone domains by matching the translated modality with the observed data from the same modality. In contrast to existing works, our proposed approach can associate identities in multi-person scenarios where all users may be performing the same activity. Extensive experiments in real-world indoor and outdoor environments demonstrate that online association on camera and phone data (IMU and FTM) achieves an average Identity Precision Accuracy (IDP) of 88.39% over a 1- to 3-second window, outperforming the state-of-the-art Vi-Fi (82.93%). Further study of modalities within the phone domain shows that FTM can improve association performance by 12.56% on average. Finally, results from our sensitivity experiments demonstrate the robustness of ViTag under different noise and environment variations.
  4. We demonstrate an application for finding target persons in surveillance video. Each visually detected participant is tagged with a smartphone ID, and the target person with the query ID is highlighted. This work is motivated by the fact that establishing associations between subjects observed in camera images and messages transmitted from their wireless devices can enable fast and reliable tagging. This is particularly helpful when target pedestrians need to be found in public surveillance footage without relying on facial recognition. The underlying system uses a multi-modal approach that leverages WiFi Fine Timing Measurements (FTM) and inertial sensor (IMU) data to associate each visually detected individual with a corresponding smartphone identifier. These smartphone measurements are combined strategically with RGB-D information from the camera to learn affinity matrices using a multi-modal deep learning network.
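
The association step described in item 3, where a translated modality is matched against observed data from the same modality, can be illustrated with a small sketch. This is not the authors' implementation: the Euclidean distance, the brute-force search over one-to-one assignments, and the names `sequence_distance` and `associate` are assumptions chosen for illustration, and the toy sequences are made up.

```python
from itertools import permutations
from math import sqrt


def sequence_distance(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def associate(translated, observed):
    """Return a dict mapping each camera track ID to a phone ID.

    translated: {track_id: sequence reconstructed from camera data}
    observed:   {phone_id: sequence actually measured on the phone}
    Hypothetical helper: tries every one-to-one assignment and keeps the
    one with the smallest total distance. Brute force is only practical
    for a handful of people.
    """
    track_ids = list(translated)
    phone_ids = list(observed)
    if len(track_ids) > len(phone_ids):
        raise ValueError("need at least as many phones as tracks")
    best_cost, best_match = float("inf"), {}
    for perm in permutations(phone_ids, len(track_ids)):
        cost = sum(
            sequence_distance(translated[t], observed[p])
            for t, p in zip(track_ids, perm)
        )
        if cost < best_cost:
            best_cost, best_match = cost, dict(zip(track_ids, perm))
    return best_match


# Toy data (made up): ranging-like readings over four time steps.
translated = {"track_A": [2.0, 2.5, 3.0, 3.5], "track_B": [8.0, 7.0, 6.0, 5.0]}
observed = {"phone_1": [8.1, 6.9, 6.2, 5.1], "phone_2": [2.1, 2.4, 3.1, 3.4]}
matches = associate(translated, observed)
```

For larger groups, an optimal assignment solver such as the Hungarian algorithm would replace the permutation search, since the number of permutations grows factorially with the number of people.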